Behavioral Analysis of Benzoquinone Defensive Gland Usage by Bug

(c) 2022 Julian Wagner. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license. Code for non-parametric bootstrapping of linear regression slopes from work by Justin Bois.

In [68]:
import numpy as np
import pandas as pd
import bokeh
import bokeh.io
import bokeh.plotting
import bebi103
import bokeh_catplot
import re
import glob
bokeh.io.output_notebook()
Loading BokehJS ...

Study setup

Liometopum occidentale is an ecologically dominant ant of oak forests in the Southwest United States and Mexico. It has numerous symbionts, including several species of beetle, a cricket, as well as hemipterans. Pamillia behrensii is one associate of the L. occidentale. P. behrensii has a metathoracic gland with classic defensive compounds including benzoquinones, a potent irritant. To test whether the bug uses it's gland secretion for defense, we performed behavioral experiments coupled with solid phase micro-extraction and mass spectrometry to assess whether aggressive interactions with ants elicit gland usage. Here, we measured whether the amount of bug movement over the course of a behavior trial (a proxy for attempts to escape/avoid a threatening stimulus) correlates with amount of gland secretion deployed.

Pamillia Pamillia

Pamillia behrensii. The metathoracic gland can be seen on the underside of the bug, between second and third set of legs; it is the white curved structure.

Behavioral analysis

To assess gland deployment, we employed an infrared illuminated arena, and placed a bug with either 5 ants, 5 ants with glued mandibles (to reduce their aggression/potency against the bug), or alone. We chilled the animals on ice and loaded them into the arena. The trials were run for 30 minutes with SPME fiber deployed, before the SPME fiber was injected into a GCMS. To start the analysis, we begin by loading up metadata on the behavioral videos into a dataframe.

In [2]:
metadata = pd.read_csv("./pamillia_dlc_analysis/pamillia_spme_metadata.csv", index_col=0)
metadata['blob_file'] = [re.sub("DLC.*", "_dist_im_an.csv", re.sub("./pamillia_dlc_analysis/", "C:/Users/jwagne/git/pamillia_only-julian_wagner-2020-08-05/videos/", f)) for f in metadata['DLC_file']]
metadata.head()
Out[2]:
species ID date SPME_start_time SPME_stop_time SPME_total_time interactor order heptan-2-one heptyl acetate 2-ethyl-1-4-benzoquinone 2-ethyl-1-4-hydroquinone DLC_file x1_crop x2_crop pixPerMm timestamp_file blob_file
0 pamillia 1 20191107 59 620 561 5_locc 1 12138 484 10548 5249 ./pamillia_dlc_analysis/pamillia_1_cam_1_date_... 160 1230 30.571429 ./pamillia_dlc_analysis/pamillia_1_cam_1_date_... C:/Users/jwagne/git/pamillia_only-julian_wagne...
1 pamillia 1 20191107 56 1813 1757 5_locc 2 714989962 2887643 115947543 171544 ./pamillia_dlc_analysis/pamillia_1_round2_cam_... 150 1230 30.857143 ./pamillia_dlc_analysis/pamillia_1_round2_cam_... C:/Users/jwagne/git/pamillia_only-julian_wagne...
2 pamillia 2 20191107 69 1914 1845 5_locc 1 46007517 485025 1917440 628632 ./pamillia_dlc_analysis/pamillia_2_cam_1_date_... 150 1225 30.714286 ./pamillia_dlc_analysis/pamillia_2_cam_1_date_... C:/Users/jwagne/git/pamillia_only-julian_wagne...
3 pamillia 2 20191107 37 1941 1904 5_locc 2 1358301 197711 216459 83717 ./pamillia_dlc_analysis/pamillia_2_round2_cam_... 150 1225 30.714286 ./pamillia_dlc_analysis/pamillia_2_round2_cam_... C:/Users/jwagne/git/pamillia_only-julian_wagne...
4 pamillia 3 20191112 46 1817 1771 5_locc 1 48541 79062 85775 87096 ./pamillia_dlc_analysis/pamillia_3_round1_cam_... 150 1240 31.142857 ./pamillia_dlc_analysis/pamillia_3_round1_cam_... C:/Users/jwagne/git/pamillia_only-julian_wagne...

Peaks from the GCMS run were manually integrated for the four identified gland compounds to get total counts, a measure for total amount of each compound from the gland secretion. In order to look at the behavior of the bug, namely its movement over time, we used DeepLabCut (DLC). Markers for the head, two places on the thorax and two on the abdomen were manually annotated on 1033 frames from a variety of of videos. Some images were annotated as refinement of the trained network after identification of poorly labeled frames. The behavioral videos and the results of the annotation with DLC look as follows:

Behavioral data frame DLC annotated data frame

We can take a look at the chemical data to see amounts of gland compounds deployed in our different treatment conditions (5 ants [5_locc], 5 glued-mandible ants [5_locc_glued], or bug alone [alone]). We plot both raw $\mathrm{count}$ data as well as $log(\mathrm{count})$ data due to the large scale of the axis.

In [3]:
metadata['log heptan-2-one'] = np.log(metadata['heptan-2-one'])
metadata['log heptyl acetate'] = np.log(metadata['heptyl acetate'])
metadata['log 2-ethyl-1-4-benzoquinone'] = np.log(metadata['2-ethyl-1-4-benzoquinone'])
metadata['log 2-ethyl-1-4-hydroquinone'] = np.log(metadata['2-ethyl-1-4-hydroquinone'])

plots = []
for chem in ['heptan-2-one', 'heptyl acetate', '2-ethyl-1-4-benzoquinone', '2-ethyl-1-4-hydroquinone',
             'log heptan-2-one', 'log heptyl acetate', 'log 2-ethyl-1-4-benzoquinone', 'log 2-ethyl-1-4-hydroquinone',]:
    p = bokeh_catplot.strip(
        data=metadata.loc[metadata['SPME_total_time']>1600],
        cats='interactor',
        val=chem,
        jitter=True,
        horizontal=True,
        marker_kwargs=dict(alpha=0.5, size=8),plot_height=150,palette=bokeh.palettes.viridis(3), plot_width=500,title='Counts from integrated GCMS peak',
    )
    plots.append(p)

bokeh.io.show(bokeh.layouts.gridplot(plots, ncols=2))

We can see quite a bit of variability between trials, but a generally very high quantity of gland compound deployed when the bug interacted with ants, whereas a small/GCMS baseline amount present when the bug was alone in the arena. This indicates the bug actively uses the gland, rather than the compounds continuously diffusing from the gland. The stark difference in the distribution of amount of chemical used when the big faces an enemy vs is alone is more clearly seen with an empirical cumulative distribution of the data.

In [120]:
plots = []
for chem in ['heptan-2-one', 'heptyl acetate', '2-ethyl-1-4-benzoquinone', '2-ethyl-1-4-hydroquinone']:
    p = bokeh_catplot.ecdf(
        data=metadata.loc[metadata['SPME_total_time']>1600],
        cats='interactor',
        val=chem,
        marker_kwargs=dict(alpha=1), style='staircase', plot_width=500,palette=bokeh.palettes.viridis(3),title='Counts from integrated GCMS peak'
    )
    p.legend.location = 'bottom_right'
    p.output_backend = 'svg'
    p.outline_line_color = None
    plots.append(p)

bokeh.io.show(bokeh.layouts.gridplot(plots, ncols=2))

The distributions look pretty similar for the ant and glued ant conditions, and compound amount near to zero for the big alone. We can now look at whether the level of movement of the bug, a proxy for how much it engages in an escape response or how agitated it is, correlates with how much chemical it secreted from its gland during the trial. We define some helper functions to read in the DLC data and filter it.

In [5]:
def read_files(DLC_file, time_file, network, start, stop, SPME_time_buffer=20):
    df_pos = pd.read_hdf(DLC_file)
    df_pos_time = pd.read_csv(time_file)
    df_pos[(network, 'all', 't')] = df_pos_time['time']
    
    df_pos_med = df_pos.rolling(5).median()
    df_pos_part_med_x = np.median(df_pos_med.dropna()[[(network, 'bh', 'x'), (network, 'bt1', 'x'), (network, 'bt2', 'x'), (network, 'ba1', 'x'), (network, 'ba2', 'x')]], axis=1)
    df_pos_part_med_y = np.median(df_pos_med.dropna()[[(network, 'bh', 'y'), (network, 'bt1', 'y'), (network, 'bt2', 'y'), (network, 'ba1', 'y'), (network, 'ba2', 'y')]], axis=1)
    
    df_tidy = pd.DataFrame({'t':df_pos_med.dropna()[(network, 'all', 't')], 'x':df_pos_part_med_x, 'y':df_pos_part_med_y})
    if stop == 'full':
        df_tidy = df_tidy.loc[df_tidy['t']>start]
    else:
        df_tidy = df_tidy.loc[(df_tidy['t']>start) & (df_tidy['t']<stop)]
    return(df_tidy)

def load_DLC_medians(metadata, ind, SPME_time_buffer=20):
    
    DLC_file = metadata.iloc[ind]['DLC_file']
    time_file = metadata.iloc[ind]['timestamp_file']
    
    if ';' in metadata.iloc[ind]['DLC_file']:
        file1, file2 = re.match('([^;]+); *([^;]+)', DLC_file).groups()
        file1_time, file2_time = re.match('([^;]+); *([^;]+)', time_file).groups()
        network = re.search('(DLC.*).h5', file1)[1]
        start, stop = (metadata.iloc[ind]['SPME_start_time'], metadata.iloc[ind]['SPME_stop_time'])
        df1 = read_files(file1, file1_time, network, start, 'full')
        df2 = read_files(file2, file2_time, network, 0, stop)
        df2['t'] = df2['t'] + df1['t'].max()
        df_tidy = pd.concat([df1, df2])
    else:    
        network = re.search('(DLC.*).h5', DLC_file)[1]
        start, stop = (metadata.iloc[ind]['SPME_start_time'], metadata.iloc[ind]['SPME_stop_time'])
        df_tidy = read_files(DLC_file, time_file, network, start, stop)
    
    return(df_tidy)

With these functions ready, we can iterate through the the data and read in the position information of the bug in the arena over time. Note that there were some preliminary experiments that were shorter than 30 mins. With SPME, if the trial is not run for a comparable length of time, the chemical amounts cannot be directly compared. Hence, these short trials were excluded from the following analysis.

In [6]:
cols = {'5_locc':bokeh.palettes.viridis(3)[0], '5_locc_glued':bokeh.palettes.viridis(3)[1], 'alone':bokeh.palettes.viridis(3)[2]}
df_t_list, colors, interactors, IDs, BQs, pixPerMms, trialOrder, dists = ([], [], [], [], [], [], [], [])
for i in range(len(metadata)):
    if metadata.iloc[i]['SPME_total_time'] < 1500:
        continue
    df_t_list.append(load_DLC_medians(metadata, i))
    colors.append(cols[metadata.iloc[i]['interactor']])
    interactors.append(metadata.iloc[i]['interactor'])
    IDs.append(metadata.iloc[i]['ID'])
    BQs.append(metadata.iloc[i]['2-ethyl-1-4-benzoquinone'])
    pixPerMms.append(metadata.iloc[i]['pixPerMm'])
    trialOrder.append(metadata.iloc[i]['order'])
    try:
        df_dists = pd.read_csv(metadata.iloc[i]['blob_file'])
        dists.append(np.sum(df_dists.loc[df_dists['all_dists']< 5000].rolling(10).median()['all_dists'].dropna().values < 100)*(1/25)*(1/60))
    except:
        dists.append(0)

We can now check on what the positional data looks like. For this, we will plot the x-position of a bug over time. Each line is a different experiment, with the bug ID listed on the left. The y-axis represents the position in the arena over time, so movement up or down in that axis represents movement across the arena. The positional values are shifted and scaled to show all the trials.

In [7]:
p = bokeh.plotting.figure(plot_width=1500, x_axis_label='time (min)')
p.yaxis.visible=False
for i, df_t in enumerate(df_t_list):
    p.line((df_t['t']-df_t['t'].min())/60, 1.5*IDs[i]+trialOrder[i]/4+df_t['x']/6000, color=colors[i], legend_label=interactors[i], line_width=3)
    if trialOrder[i] == 1 or IDs[i]==1:
        mytext = bokeh.models.Label(x=-0.8, y=1.5*IDs[i]+trialOrder[i]/4, text=str(IDs[i]), text_align='center', text_font_size='18pt')
        p.add_layout(mytext)
        
mytext = bokeh.models.Label(x=-0.8, y=1.5*IDs[i]+trialOrder[i]/4, text='ID#', text_align='center', text_font_size='12pt')
p.add_layout(mytext)
p.legend.location='top_right'
bokeh.io.show(p)

We can see that there is quite a bit of variability in how much the bug moves in the various experiments. The more the line squiggle in the line, the more the animal is moving. Note that in some trials the movement trace rapidly oscillates up and down, which indicates that the bug is running around the edge of the circular arena (which is classically considered a stress response). For the next step, we will collapse all this movement data into a single number per trial, the distanced traversed by the animal over the 30 minute experiment. Though it loses a lot of the information, it gives one readout for how agitated the bug was during the trial. In order to get a statistical assessment of whether there is a correlation of gland chemical deployment with bug distance traveled, we first define a function to do a non-parametric bootstrap of the linear regression slope/intercept for the data. Due to the large magnitude/range of values for chemical amount and bug movement, we perform the regression on log transformed data.

In [8]:
def draw_bs_pairs_linreg(x, y, size=1):
    """Perform pairs bootstrap for linear regression."""
    # Set up array of indices to sample from
    inds = np.arange(len(x))

    # Initialize samples
    bs_slope_reps = np.empty(size)
    bs_intercept_reps = np.empty(size)

    # Take samples
    for i in range(size):
        bs_inds = np.random.choice(inds, len(inds))
        bs_x, bs_y = x[bs_inds], y[bs_inds]
        bs_slope_reps[i], bs_intercept_reps[i] = np.polyfit(bs_x, bs_y, deg=1)

    return bs_slope_reps, bs_intercept_reps

With this function ready, we can make a plot with a confidence interval for the slope of amount of chemical deployed vs distance of bug movement.

In [114]:
x_move = []
y_bq = []
for i, df_t, color, interactor, ID, BQ, pixPerMm in zip(range(len(df_t_list)), df_t_list, colors, interactors, IDs, BQs, pixPerMms):
    move = np.sqrt(np.square(df_t.shift()['x'] - df_t['x']) + np.square(df_t.shift()['y'] - df_t['y'])).sum()
    move = move/(pixPerMm*10)
    x_move.append(np.log(move))
    y_bq.append(np.log(BQ))

x_move = np.array(x_move)
y_bq = np.array(y_bq)
bs_slope_reps, bs_intercept_reps = draw_bs_pairs_linreg(x_move, y_bq, size=10000)

p = bokeh.plotting.figure(plot_height=400, plot_width=500, x_axis_label='log(cm traveled by bug during trial)', y_axis_label='log(BQ counts from GCMS of SPME)')
p.title.text = 'Slope 95% CI: ' + str([round(j, 3) for j in np.percentile(bs_slope_reps, [2.5, 97.5])])

# x-values
x = np.linspace(x_move.min(), x_move.max(), 200)
# y-values of each point
y = np.outer(bs_slope_reps, x) + np.stack([bs_intercept_reps]*200, axis=1)

# Compute percentiles for [2.5th and 97.5th], [12.5th and 87.5th], and [45.0th and 55.0th]
for percents in [[2.5, 97.5], [12.5, 87.5], [45.0, 55.0]]:
    low, high = np.percentile(y, percents, axis=0)
    p1 = np.append(x, x[::-1])
    p2 = np.append(low, high[::-1])
    p.patch(p1, p2, alpha=0.5, color='grey', line_color='grey')

p.circle(x_move, y_bq, color=colors, size=20)
p.outline_line_color = None

p.output_backend = 'svg'
bokeh.io.show(p)

Note that the gray intervals show the 10th, 75th, and 95th percentiles for a regression of data points generated via a non-parametric bootstrap. There is a lot of variability between trials in both the amount of chemical deployed and the movement of the insect! Even so, there is a significant positive association between amount of bug movement (a proxy for agitation/attempts to escape an enemy) and amount of gland chemical deployed. This gives a very strong indication that gland compound secretion is actively controlled by the bug and used in defense to help the animal escape a would-be predator or aggressor. One large confound for the movement measurement are cases where the ant is able to successfully grab/hold the bug. In such cases, the bug will not be able to run (reducing the movement) while it may still deploy the gland. Additionally, the movement measure does not directly capture the level of aggressiveness of the ants, but instead is a measure of the attempted escape/agitation of the bug. Correlation of some other behavioral metrics, like how much time spent close to ants or how much the ants moved during th experiment, with amount of chemical deployed would provide additional insight into the bug's pattern of usage of it's defensive gland.

We will now look at only the trials where the bug was interacting with 5 ants without glued mandibles. We are interested in whether the presence of long bouts where the bug is in close proximity to an ant (likely cases where the ant has successfully grabbed the bug and is latched onto it) correlate with deployment of more gland compound. This would give further evidence that the bug deploys its gland as a defense response when encountering aggression from another animal/needs to escape. We will first load in metadata about the analysis that we will use to quantify this.

In [10]:
metadata = pd.read_csv("./pamillia_dlc_analysis/pamillia_spme_metadata.csv", index_col=0)
metadata['blob_file'] = [re.sub("DLC.*", "_dist_im_an.csv", re.sub("./pamillia_dlc_analysis/", "./distance_measures/", f)) for f in metadata['DLC_file']]
metadata.head()
Out[10]:
species ID date SPME_start_time SPME_stop_time SPME_total_time interactor order heptan-2-one heptyl acetate 2-ethyl-1-4-benzoquinone 2-ethyl-1-4-hydroquinone DLC_file x1_crop x2_crop pixPerMm timestamp_file blob_file
0 pamillia 1 20191107 59 620 561 5_locc 1 12138 484 10548 5249 ./pamillia_dlc_analysis/pamillia_1_cam_1_date_... 160 1230 30.571429 ./pamillia_dlc_analysis/pamillia_1_cam_1_date_... ./distance_measures/pamillia_1_cam_1_date_2019...
1 pamillia 1 20191107 56 1813 1757 5_locc 2 714989962 2887643 115947543 171544 ./pamillia_dlc_analysis/pamillia_1_round2_cam_... 150 1230 30.857143 ./pamillia_dlc_analysis/pamillia_1_round2_cam_... ./distance_measures/pamillia_1_round2_cam_1_da...
2 pamillia 2 20191107 69 1914 1845 5_locc 1 46007517 485025 1917440 628632 ./pamillia_dlc_analysis/pamillia_2_cam_1_date_... 150 1225 30.714286 ./pamillia_dlc_analysis/pamillia_2_cam_1_date_... ./distance_measures/pamillia_2_cam_1_date_2019...
3 pamillia 2 20191107 37 1941 1904 5_locc 2 1358301 197711 216459 83717 ./pamillia_dlc_analysis/pamillia_2_round2_cam_... 150 1225 30.714286 ./pamillia_dlc_analysis/pamillia_2_round2_cam_... ./distance_measures/pamillia_2_round2_cam_1_da...
4 pamillia 3 20191112 46 1817 1771 5_locc 1 48541 79062 85775 87096 ./pamillia_dlc_analysis/pamillia_3_round1_cam_... 150 1240 31.142857 ./pamillia_dlc_analysis/pamillia_3_round1_cam_... ./distance_measures/pamillia_3_round1_cam_1_da...

Using the joint locations from the DeepLabCut annotations, we matted out the location of the bug in each frame. We then performed bg subtraction and thresholding to get the location of all the other blobs in the video, which represents the location of the ants in the frame. With the bug joint locations and and blobs, we calculated the minimal distance between any of the bug joint locations from the DLC labels and an ant in each frame, and saved this out to a csv. A visualization of the analysis is below.

Distance between animals calculated from behavioral data

We calculated the smallest distance (shown with the white line) between the bug and ant in each frame. Bug location was matted out based on location of DLC labels (matting shown in purple) and thresholding was used to acquire locations of all other animals (shown in yellow). Minimals distance between the blob locations and the bug joints was then calculated.

With distance between ant and bug for all videos and frames, we can proceed! For our analysis here, we are interested in what happens to total measured secretion amount when the bug gets grabbed by the ant. This will look like bouts of time where the distance between the ant and bug is small for an extended period of time. The bug body length is about 100 pixels, so we consider close run ins with ants as distances less than 100 pixels apart. We consider periods where the bug is within this short distance near the ant for more than 8 seconds, quite a long time to sit around near an ant unless you have been snagged. We load in the data and calculate the amount of time the bug is 'stuck' by an ant based on our definition, i.e. within a body length for 8 or more seconds.

In [111]:
dists = glob.glob("./distance_measures/*")

stuck_thresh = 8

files = []
closes = []
close_blocks = []
for f in dists:
    df = pd.read_csv(f, index_col=0)
    #df.rolling(10).median().dropna()
    df['close_block'] = (df['all_dists'] < 100).astype(int)
    blocks = []
    close_blocks.append(0)
    for ff, dd in df.groupby((df['close_block'].shift() != df['close_block']).cumsum()):
        if not dd['close_block'].values[0] == 1:
            continue
        time_together = len(dd)/25
        if time_together > stuck_thresh:
            close_blocks[-1] += time_together
            
    files.append(re.sub('\\\\',"/", f))
    closes.append(len(df.loc[df['all_dists'] < 100]))
    
dist_df = pd.DataFrame({'blob_file':files, 'closes':closes, 'stuck_together': close_blocks})

df = metadata.merge(dist_df, on='blob_file')
df = df.loc[(df['interactor'] == '5_locc') & (df['SPME_total_time'] > 1600)]
df.head()
Out[111]:
species ID date SPME_start_time SPME_stop_time SPME_total_time interactor order heptan-2-one heptyl acetate 2-ethyl-1-4-benzoquinone 2-ethyl-1-4-hydroquinone DLC_file x1_crop x2_crop pixPerMm timestamp_file blob_file closes stuck_together
1 pamillia 1 20191107 56 1813 1757 5_locc 2 714989962 2887643 115947543 171544 ./pamillia_dlc_analysis/pamillia_1_round2_cam_... 150 1230 30.857143 ./pamillia_dlc_analysis/pamillia_1_round2_cam_... ./distance_measures/pamillia_1_round2_cam_1_da... 13101 385.16
2 pamillia 2 20191107 69 1914 1845 5_locc 1 46007517 485025 1917440 628632 ./pamillia_dlc_analysis/pamillia_2_cam_1_date_... 150 1225 30.714286 ./pamillia_dlc_analysis/pamillia_2_cam_1_date_... ./distance_measures/pamillia_2_cam_1_date_2019... 172 0.00
3 pamillia 2 20191107 37 1941 1904 5_locc 2 1358301 197711 216459 83717 ./pamillia_dlc_analysis/pamillia_2_round2_cam_... 150 1225 30.714286 ./pamillia_dlc_analysis/pamillia_2_round2_cam_... ./distance_measures/pamillia_2_round2_cam_1_da... 887 33.56
4 pamillia 3 20191112 46 1817 1771 5_locc 1 48541 79062 85775 87096 ./pamillia_dlc_analysis/pamillia_3_round1_cam_... 150 1240 31.142857 ./pamillia_dlc_analysis/pamillia_3_round1_cam_... ./distance_measures/pamillia_3_round1_cam_1_da... 8211 20.16
6 pamillia 4 20191112 37 1768 1731 5_locc 1 32219379 803154 1828588 252432 ./pamillia_dlc_analysis/pamillia_4_round1_cam_... 145 1235 31.142857 ./pamillia_dlc_analysis/pamillia_4_round1_cam_... ./distance_measures/pamillia_4_round1_cam_1_da... 7490 0.00

The data looks to have merged with our metadata on the secretion amounts from the SPME GCMS! We are only considering the trials where the ants were not glued here, as they are the only condition where the ant can truly grab hold of the bug. We plot the amount of time stuck with the ant against the measured amount of benzoquinone during the trial.

In [112]:
p = bokeh.plotting.figure(plot_height=400, x_axis_label='time close to ant (minutes)', y_axis_label='log BQ counts')
pal = bokeh.palettes.viridis(3)
pal_dict = {'5_locc':pal[0], '5_locc_glued':pal[1], 'alone':pal[2]}
df['colors'] = [pal_dict[i] for i in df['interactor'].values]

p.scatter(df['stuck_together']/60, np.log(df['2-ethyl-1-4-benzoquinone']+1), color=df['colors'], size=15)

bokeh.io.show(p)

We see a trend that the more time spent super close to the ant the more secretion deployed. All of the highest secretion experiments also had substantial periods during the trial where the animal was very close to the ant. THis suggests that the bug can actively deploy its gland to a greater extent when predators like ants successfully grab onto it as a way to confuse or deter it from continuing an attack. We can do some stats on this trend with non-parametric bootstrapping, as before.

In [115]:
x_move = np.array(df['stuck_together'].values)/60
y_bq = np.array(np.log(df['2-ethyl-1-4-benzoquinone'].values+1))
bs_slope_reps, bs_intercept_reps = draw_bs_pairs_linreg(x_move, y_bq, size=10000)

p = bokeh.plotting.figure(plot_height=400, plot_width=500, x_axis_label='minutes within a body length of ant for more than 8 seconds', y_axis_label='log(BQ counts from GCMS of SPME)')
p.title.text = 'Slope 95% CI: ' + str([round(j, 3) for j in np.percentile(bs_slope_reps, [2.5, 97.5])])

# x-values
x = np.linspace(x_move.min(), x_move.max(), 200)
# y-values of each point
y = np.outer(bs_slope_reps, x) + np.stack([bs_intercept_reps]*200, axis=1)

# Compute percentiles for [2.5th and 97.5th], [12.5th and 87.5th], and [45.0th and 55.0th]
for percents in [[2.5, 97.5], [12.5, 87.5], [45.0, 55.0]]:
    low, high = np.percentile(y, percents, axis=0)
    p1 = np.append(x, x[::-1])
    p2 = np.append(low, high[::-1])
    p.patch(p1, p2, alpha=0.5, color='grey', line_color='grey')

p.circle(x_move, y_bq, color=pal[0], size=20)

p.outline_line_color = None

p.output_backend = 'svg'
bokeh.io.show(p)

The confidence interval for the slope indicates a significant and positive trend for the duration of time very close to ants and the log of benzoquinone secretion amount. This provides evidence for gland usage as an escape tactic when the bug gets pinned by a predator.